Smart Paper Notes

Multi-modal data fusion of Voice and EMG data for Robotic Control

Tauheed Khan Mohd, Jackson Carvalho, Ahmad Y Javaid

Category: Robotics

2022-01-06

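Wearable electronic devices are constantly evolving and increasing the integration of humans with technology. Available in various forms, these flexible and bendable devices can sense and measure physiological and muscular changes in the human body and use those signals for machine control. The Myo gesture band, one such device, captures electromyography (EMG) data from myoelectric signals and translates them into input signals through a set of predefined gestures. Using this device in a multi-modal environment not only increases the types of work that can be accomplished with it, but also helps improve the accuracy of the tasks performed. This paper addresses the fusion of two input modalities, speech and myoelectric signals, captured through a microphone and the Myo band respectively, to control a robotic arm. The experimental results obtained are also presented, along with their accuracies for performance analysis.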

Translated by Google Translate

Flexible Supervised Autonomy for Exploration in Subterranean Environments

Harel Biggie, Eugene R. Rush, Danny G. Riley, Shakeeb Ahmad, Michael T. Ohradzansky, Kyle Harlow, Michael J. Miles, Daniel Torres, Steve McGuire, Eric W. Frew

Category: Robotics

2023-01-02

While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast-track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful and independent single-agent autonomy stack with higher-level mission management operating over a flexible mesh network. The autonomy suite deployed on quadruped and wheeled robots was fully independent, freeing the human supervisor to loosely supervise the mission and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and reconfigurable communications.

Floods Relevancy and Identification of Location from Twitter Posts using NLP Techniques

Muhammad Suleman, Muhammad Asif, Tayyab Zamir, Ayaz Mehmood, Jebran Khan, Nasir Ahmad, Kashif Ahmad

Category: Natural Language Processing

2023-01-01

This paper presents our solutions for the MediaEval 2022 task on DisasterMM. The task is composed of two subtasks, namely (i) Relevance Classification of Twitter Posts (RCTP), and (ii) Location Extraction from Twitter Texts (LETT). The RCTP subtask aims at differentiating flood-related and non-relevant social posts, while LETT is a Named Entity Recognition (NER) task that aims at extracting location information from the text. For RCTP, we proposed four different solutions based on BERT, RoBERTa, DistilBERT, and ALBERT, obtaining F1-scores of 0.7934, 0.7970, 0.7613, and 0.7924, respectively. For LETT, we used three models, namely BERT, RoBERTa, and DistilBERT, obtaining F1-scores of 0.6256, 0.6744, and 0.6723, respectively.

Relevance Classification of Flood-related Twitter Posts via Multiple Transformers

Wisal Mukhtiar, Waliiya Rizwan, Aneela Habib, Yasir Saleem Afridi, Laiq Hasan, Kashif Ahmad

Category: Natural Language Processing

2023-01-01

In recent years, social media has been widely explored as a potential source of communication and information in disasters and emergency situations. Several interesting works and case studies of disaster analytics exploring different aspects of natural disasters have already been conducted. Along with its great potential, disaster analytics comes with several challenges, mainly due to the nature of social media content. In this paper, we explore one such challenge and propose a text classification framework to deal with noisy Twitter data. More specifically, we employed several transformers, both individually and in combination, to differentiate between relevant and non-relevant Twitter posts, achieving the highest F1-score of 0.87.
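
The abstract does not say how the transformers are combined. One common option is soft voting: average each model's probability that a post is relevant, threshold the average, and score the result with F1. The sketch below assumes that scheme. The function names, the 0.5 threshold, and all probabilities and labels are made up for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def soft_vote(prob_per_model: Sequence[Sequence[float]],
              threshold: float = 0.5) -> List[int]:
    """Average the per-post 'relevant' probability across models, then threshold."""
    n_models = len(prob_per_model)
    n_posts = len(prob_per_model[0])
    averaged = [
        sum(model[i] for model in prob_per_model) / n_models
        for i in range(n_posts)
    ]
    return [1 if p >= threshold else 0 for p in averaged]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical probabilities from three models for four posts.
model_a = [0.9, 0.4, 0.2, 0.7]
model_b = [0.8, 0.7, 0.1, 0.2]
model_c = [0.6, 0.6, 0.3, 0.3]
labels = [1, 1, 0, 1]  # hypothetical ground truth

predictions = soft_vote([model_a, model_b, model_c])
score = f1_score(labels, predictions)
```

In this toy data the ensemble recovers the second post, which `model_a` alone would miss, but still misses the fourth.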

A deep real options policy for sequential service region design and timing

Srushti Rath, Joseph Y. J. Chow

Category: Machine Learning | Artificial Intelligence

2022-12-30

As various city agencies and mobility operators navigate toward innovative mobility solutions, there is a need for strategic flexibility in well-timed investment decisions in the design and timing of mobility service regions, which can be cast as "real options" (RO). This problem becomes increasingly challenging with multiple interacting RO in such investments. We propose a scalable machine learning based RO framework for the multi-period sequential service region design & timing problem for mobility-on-demand services, framed as a Markov decision process with non-stationary stochastic variables. A value function approximation policy from the literature uses multi-option least squares Monte Carlo simulation to get a policy value for a set of interdependent investment decisions as deferral options (CR policy). The goal is to determine the optimal selection and timing of a set of zones to include in a service region. However, prior work required explicit enumeration of all possible sequences of investments. To address the combinatorial complexity of such enumeration, we propose a new variant "deep" RO policy using an efficient recurrent neural network (RNN) based ML method (CR-RNN policy) to sample sequences, forgoing the need for enumeration and making the network design & timing policy tractable for large-scale implementation. Experiments on multiple service region scenarios in New York City (NYC) show that the proposed policy substantially reduces the overall computational cost (time reduction for RO evaluation of > 90% of total investment sequences is achieved), with zero to near-zero gap compared to the benchmark. A case study of sequential service region design for expansion of MoD services in Brooklyn, NYC shows that using the CR-RNN policy to determine the optimal RO investment strategy yields similar performance (within 0.5% of the CR policy value) with significantly reduced computation time (about 5.4 times faster).

Controllable Mechanical-domain Energy Accumulators

Sung Y. Kim, David J. Braun

Category: Robotics

2022-12-29

Springs are efficient in storing and returning elastic potential energy but are unable to hold the energy they store in the absence of an external load. Lockable springs use clutches to hold elastic potential energy in the absence of an external load but have not yet been widely adopted in applications, partly because clutches introduce design complexity, reduce energy efficiency, and typically do not afford high-fidelity control over the energy stored by the spring. Here, we present the design of a novel lockable compression spring that uses a small capstan clutch to passively lock a mechanical spring. The capstan clutch can lock up to 1000 N force at any arbitrary deflection, unlock the spring in less than 10 ms with a control force less than 1 % of the maximal spring force, and provide an 80 % energy storage and return efficiency (comparable to a highly efficient electric motor operated at constant nominal speed). By retaining the form factor of a regular spring while providing high-fidelity locking capability even under large spring forces, the proposed design could facilitate the development of energy-efficient spring-based actuators and robots.
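
For intuition on how a small control force can hold a large spring force, the classical capstan (Euler-Eytelwein) relation for an ideal flexible rope wrapped around a cylinder gives a hold ratio of exp(mu * phi), where mu is the friction coefficient and phi the wrap angle in radians. The sketch below uses that textbook relation only. The abstract gives neither mu nor the wrap angle, so the value mu = 0.3 is an assumption, and whether the authors' clutch follows the ideal model is not confirmed by the source.

```python
import math


def capstan_hold_ratio(mu: float, wrap_angle_rad: float) -> float:
    """Ideal capstan equation: load tension / holding tension = exp(mu * phi)."""
    return math.exp(mu * wrap_angle_rad)


def required_wrap_angle(ratio: float, mu: float) -> float:
    """Wrap angle (radians) needed to reach a given load-to-hold tension ratio."""
    return math.log(ratio) / mu


# A control force below 1% of the spring force means a ratio of at least 100.
ASSUMED_MU = 0.3  # hypothetical friction coefficient, not from the paper
angle = required_wrap_angle(100.0, ASSUMED_MU)
turns = angle / (2 * math.pi)
```

Under these assumptions, roughly two and a half turns of wrap would be enough for a 100:1 ratio, which illustrates why the exponential dependence on wrap angle makes a compact clutch plausible.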

Characterization of the Global Bias Problem in Aerial Federated Learning

Ruslan Zhagypar, Nour Kouzayha, Hesham ElSawy, Hayssam Dahrouj, Tareq Y. Al-Naffouri

Category: Machine Learning

2022-12-29

The mobility of unmanned aerial vehicles (UAVs) enables flexible and customized federated learning (FL) at the network edge. However, the underlying uncertainties in the aerial-terrestrial wireless channel may lead to a biased FL model. In particular, the distribution of the global model and the aggregation of the local updates within the FL learning rounds at the UAVs are governed by the reliability of the wireless channel. This creates an undesirable bias towards the training data of ground devices with better channel conditions and away from those with worse conditions. This paper characterizes the global bias problem of aerial FL in large-scale UAV networks. To this end, the paper proposes a channel-aware distribution and aggregation scheme to enforce equal contribution from all devices in the FL training as a means to resolve the global bias problem. We demonstrate the convergence of the proposed method by experimenting with the MNIST dataset and show its superiority compared to existing methods. The obtained results enable system parameter tuning to relieve the impact of the aerial channel deficiency on the FL convergence rate.
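
The bias described above can be illustrated with a toy calculation. The abstract does not specify the authors' scheme, so the sketch below swaps in a standard debiasing idea, inverse-probability weighting: each received update is scaled by the reciprocal of its delivery probability. Scalar updates, the delivery probabilities, and all function names are hypothetical. Expectations are computed exactly by enumerating every delivery outcome rather than by simulation.

```python
from itertools import product
from typing import Sequence


def aggregate(updates: Sequence[float], received: Sequence[int],
              weights: Sequence[float]) -> float:
    """Weighted sum of the updates that arrived, normalized by device count."""
    n = len(updates)
    return sum(w * u for u, r, w in zip(updates, received, weights) if r) / n


def expected_aggregate(updates: Sequence[float], success_prob: Sequence[float],
                       weights: Sequence[float]) -> float:
    """Exact expectation over all 2^n independent delivery outcomes."""
    total = 0.0
    for received in product([0, 1], repeat=len(updates)):
        prob = 1.0
        for r, p in zip(received, success_prob):
            prob *= p if r else (1.0 - p)
        total += prob * aggregate(updates, received, weights)
    return total


# Hypothetical scalar updates and channel success probabilities per device.
updates = [1.0, 2.0, 6.0]
success_prob = [0.9, 0.5, 0.2]
true_mean = sum(updates) / len(updates)

naive = expected_aggregate(updates, success_prob, [1.0, 1.0, 1.0])
channel_aware = expected_aggregate(
    updates, success_prob, [1.0 / p for p in success_prob]
)
```

With uniform weights the expected aggregate is pulled toward the devices with reliable channels, while the inverse-probability weights restore the plain mean in expectation. The price is higher variance for devices with poor channels.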

Hungry Hungry Hippos: Towards Language Modeling with State Space Models

Tri Dao, Daniel Y. Fu, Khaled K. Saab, Armin W. Thomas, Atri Rudra, Christopher Ré

Category: Machine Learning | Natural Language Processing

2022-12-28

State space models (SSMs) have demonstrated state-of-the-art sequence modeling performance in some modalities, but underperform attention in language modeling. Moreover, despite scaling nearly linearly in sequence length instead of quadratically, SSMs are still slower than Transformers due to poor hardware utilization. In this paper, we make progress on understanding the expressivity gap between SSMs and attention in language modeling, and on reducing the hardware barrier between SSMs and attention. First, we use synthetic language modeling tasks to understand the gap between SSMs and attention. We find that existing SSMs struggle with two capabilities: recalling earlier tokens in the sequence and comparing tokens across the sequence. To understand the impact on language modeling, we propose a new SSM layer, H3, that is explicitly designed for these abilities. H3 matches attention on the synthetic languages and comes within 0.4 PPL of Transformers on OpenWebText. Furthermore, a hybrid 125M-parameter H3-attention model that retains two attention layers surprisingly outperforms Transformers on OpenWebText by 1.0 PPL. Next, to improve the efficiency of training SSMs on modern hardware, we propose FlashConv. FlashConv uses a fused block FFT algorithm to improve efficiency on sequences up to 8K, and introduces a novel state passing algorithm that exploits the recurrent properties of SSMs to scale to longer sequences. FlashConv yields 2× speedup on the long-range arena benchmark and allows hybrid language models to generate text 1.6× faster than Transformers. Using FlashConv, we scale hybrid H3-attention language models up to 1.3B parameters on the Pile and find promising initial results, achieving lower perplexity than Transformers and outperforming Transformers in zero- and few-shot learning on a majority of tasks in the SuperGLUE benchmark.
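
The abstract relies on the fact that a linear SSM can be evaluated either as a recurrence or as a long convolution computed with an FFT. The sketch below checks that equivalence for a generic single-input, single-output discrete SSM using NumPy. It is not the H3 layer and not FlashConv's fused block FFT or state passing algorithm. The matrices and function names are assumptions chosen for illustration.

```python
import numpy as np


def ssm_recurrent(A, B, C, u):
    """Step the state x_k = A x_{k-1} + B u_k and read out y_k = C x_k."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B * u_k
        ys.append(C @ x)
    return np.array(ys)


def ssm_kernel(A, B, C, length):
    """Convolution kernel K_j = C A^j B for j = 0 .. length-1."""
    kernel = []
    v = B.copy()
    for _ in range(length):
        kernel.append(C @ v)
        v = A @ v
    return np.array(kernel)


def ssm_fft_conv(A, B, C, u):
    """Same output as the recurrence, via a zero-padded FFT convolution."""
    length = len(u)
    kernel = ssm_kernel(A, B, C, length)
    n = 2 * length  # pad so the circular convolution equals the causal one
    y = np.fft.irfft(np.fft.rfft(kernel, n=n) * np.fft.rfft(u, n=n), n=n)
    return y[:length]


# Hypothetical stable SSM with a 3-dimensional state.
A = np.diag([0.9, 0.5, -0.3])
B = np.ones(3)
C = np.array([1.0, -1.0, 0.5])
u = np.random.default_rng(0).standard_normal(16)

y_rec = ssm_recurrent(A, B, C, u)
y_fft = ssm_fft_conv(A, B, C, u)
```

The recurrent form costs one state update per token, which suits generation, while the FFT form processes the whole sequence at once, which suits training.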

MyI-Net: Fully Automatic Detection and Quantification of Myocardial Infarction from Cardiovascular MRI Images

Shuihua Wang, Ahmed M. S. E. K Abdelaty, Kelly Parke, J Ranjit Arnold, Gerry P McCann, Ivan Y Tyukin

Category: Computer Vision | Machine Learning

2022-12-28

A "heart attack", or myocardial infarction (MI), occurs when an artery supplying blood to the heart is abruptly occluded. The "gold standard" method for imaging MI is Cardiovascular Magnetic Resonance Imaging (MRI), with intravenously administered gadolinium-based contrast (late gadolinium enhancement). However, no "gold standard" fully automated method for the quantification of MI exists. In this work, we propose an end-to-end fully automatic system (MyI-Net) for the detection and quantification of MI in MRI images. This has the potential to reduce the uncertainty due to the technical variability across labs and inherent problems of the data and labels. Our system consists of four processing stages designed to maintain the flow of information across scales. First, features from raw MRI images are generated using feature extractors built on ResNet and MobileNet architectures. This is followed by Atrous Spatial Pyramid Pooling (ASPP) to produce spatial information at different scales and preserve more image context. High-level features from ASPP and initial low-level features are concatenated at the third stage and then passed to the fourth stage, where spatial information is recovered via up-sampling to produce the final image segmentation output into: i) background, ii) heart muscle, iii) blood and iv) scar areas. New models were compared with state-of-the-art models and manual quantification. Our models showed favorable performance in global segmentation and scar tissue detection relative to state-of-the-art work, including a four-fold better performance in matching scar pixels to contours produced by clinicians.

Knowledge-Guided Data-Centric AI in Healthcare: Progress, Shortcomings, and Future Directions

Edward Y. Chang

Category: Artificial Intelligence | Machine Learning

2022-12-27

The success of deep learning is largely due to the availability of large amounts of training data that cover a wide range of examples of a particular concept or meaning. In the field of medicine, having a diverse set of training data on a particular disease can lead to the development of a model that is able to accurately predict the disease. However, despite the potential benefits, there have not been significant advances in image-based diagnosis due to a lack of high-quality annotated data. This article highlights the importance of using a data-centric approach to improve the quality of data representations, particularly in cases where the available data is limited. To address this "small-data" issue, we discuss four methods for generating and aggregating training data: data augmentation, transfer learning, federated learning, and GANs (generative adversarial networks). We also propose the use of knowledge-guided GANs to incorporate domain knowledge in the training data generation process. With the recent progress in large pre-trained language models, we believe it is possible to acquire high-quality knowledge that can be used to improve the effectiveness of knowledge-guided generative methods.
